# Document Intelligence
Kosmos 2.5
MIT
Kosmos-2.5 is a multimodal literate model for machine reading of text-dense images. It recognizes text and produces structured output from images.
Image-to-Text
Transformers English

microsoft
5,531
191
Markuplm Base Finetuned Websrc
MarkupLM is a multimodal pretrained model for visually rich document understanding and information extraction that combines text with markup-language information.
Multimodal Fusion
Transformers English

microsoft
168
10